The tidyverse packages help immensely with data cleaning and initial visualization.

library("tidyverse")

We now load the datasets from our sources, starting with the biggest and messiest one!

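# read.csv() already returns a data frame, so no data.frame() wrapper is needed
init.data <- read.csv("../data/WEF_TTCR17_data_for_download.csv")
# Print a compact tibble preview; init.data itself stays a plain data frame
as_tibble(init.data)
init.data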

We can see that the data is now viewable as a data frame. It is immediately clear that there are a lot of columns and values that we aren't interested in. The analysis covers tourism spending and other characteristics of the 28 EU member states, so the first order of business is to keep only the columns for those 28 states.

# Use the values in the first row of the raw data as the column names
colnames(init.data) <- as.character(unlist(init.data[1,]))
init.data

To achieve our goal, we first replace the header with meaningful column names taken from the first row of the data. We then create a character vector of the columns we need and subset the data to keep only the relevant fields for our 28 member states.

# Descriptor columns plus the three-letter country codes of the 28 EU member states
reqCol <- c('Series', 'Attribute', 'AUT', 'BEL', 'BGR', 'HRV', 'CYP', 'CZE', 'DNK', 'EST', 'FIN', 'FRA', 'DEU', 'GRC', 'HUN', 'IRL', 'ITA', 'LVA', 'LTU', 'LUX', 'MLT', 'NLD', 'POL', 'PRT', 'ROU', 'SVK', 'SVN', 'ESP', 'SWE', 'GBR')
# Keep every row, but only the required columns
prelimData <- init.data[, reqCol]
prelimData
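
The two steps above (promoting the first row to the header, then subsetting by column name) can be tried out on a small toy data frame. This is only a sketch: `raw`, `wanted`, `missing_cols` and `subset_data` are made-up names, the values are invented, and the real WEF file has many more columns. Two additions here are assumptions rather than part of the notebook's code: dropping the promoted row from the data afterwards, and checking that every requested column exists before subsetting.

```r
# Toy stand-in for the raw file: the real header sits in the first data row
raw <- data.frame(
  V1 = c("Series", "Arrivals", "Receipts"),
  V2 = c("Attribute", "Value", "Value"),
  V3 = c("AUT", "10", "20"),
  V4 = c("USA", "30", "40"),
  stringsAsFactors = FALSE
)

# Promote the first row to column names
colnames(raw) <- as.character(unlist(raw[1, ]))

# Assumption: the promoted row is no longer wanted as data, so drop it
raw <- raw[-1, ]

# Check that every wanted column exists before subsetting
wanted <- c("Series", "Attribute", "AUT")
missing_cols <- setdiff(wanted, colnames(raw))
stopifnot(length(missing_cols) == 0)

# Keep all rows, but only the wanted columns
subset_data <- raw[, wanted]
subset_data
```

Checking for missing names first gives a clearer failure than the "undefined columns selected" error that base R raises when a requested column is absent.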